Russian Word Prediction with Morphological Support

Authors

  • Sheri Hunnicutt
  • Lela Nozadze
  • George Chikoidze
Abstract

A co-operative project between two research groups in Tbilisi and Stockholm began in 1996. Its purpose is to extend a word predictor developed by the Swedish partner to the Russian language. Because Russian is much richer in morphological forms than the seven languages previously handled, an additional morphological component, based on an algorithm supplied by the Tbilisi group, is considered necessary. This component will provide inflectional categories and the resulting inflections for verbs, nouns and adjectives. The correct word forms can then be presented to the user of the word prediction system in a consistent manner, so that the user can easily choose the desired inflected form. At present, the classification of verbs is complete. The algorithm is also being used to automatically tag the large lexicon used in the word predictor with inflectional classes.
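The interplay the abstract describes, a predictor proposing words and a morphological component supplying the inflected forms for the chosen word, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny lexicon, the frequencies, the class labels "V1" and "V2", and the function names predict and inflect are hypothetical, and the suffix tables cover only the present tense of two regular verb patterns. It is not the Tbilisi algorithm or the Swedish predictor.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: lemma -> (frequency, inflectional class).
LEXICON: Dict[str, Tuple[int, str]] = {
    "читать": (120, "V1"),
    "гулять": (40, "V1"),
    "говорить": (200, "V2"),
}

# Hypothetical class table: class -> (characters stripped from the
# infinitive, present-tense endings for 1sg, 2sg, 3sg, 1pl, 2pl, 3pl).
CLASSES: Dict[str, Tuple[int, List[str]]] = {
    "V1": (2, ["ю", "ешь", "ет", "ем", "ете", "ют"]),  # strip "ть"
    "V2": (3, ["ю", "ишь", "ит", "им", "ите", "ят"]),  # strip "ить"
}


def predict(prefix: str, n: int = 5) -> List[str]:
    """Return up to n lemmas starting with prefix, most frequent first."""
    candidates = [lemma for lemma in LEXICON if lemma.startswith(prefix)]
    candidates.sort(key=lambda lemma: -LEXICON[lemma][0])
    return candidates[:n]


def inflect(lemma: str) -> List[str]:
    """List the present-tense forms of a lemma from its inflectional class."""
    _, cls = LEXICON[lemma]
    strip, endings = CLASSES[cls]
    stem = lemma[:-strip]
    return [stem + ending for ending in endings]


if __name__ == "__main__":
    for lemma in predict("г"):
        # Forms are always listed in the same person/number order,
        # so the user sees a consistent layout for every verb.
        print(lemma, inflect(lemma))
```

Storing one class label per lemma, rather than every inflected form, is what makes tagging the lexicon with inflectional classes useful in such a design: the forms can be generated on demand and shown in a fixed order.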

Similar resources

Russian Morphological Analysis

This paper describes an approach to organizing a model of Russian inflectional morphology and its application to morphological analysis and disambiguation of Russian. We are concerned with the POS tagging of 150-million-word Russian corpora. The approach relies heavily on the language processor Russicon and on extensive use of Russicon's electronic dictionaries.

Morphological Analysis for Russian: Integration and Comparison of Taggers

In this paper we present a comparison of three morphological taggers for Russian with regard to the quality of morphological disambiguation performed by these taggers. We test the quality of the analysis in three different ways: lemmatization, POS-tagging and assigning full morphological tags. We analyze the mistakes made by the taggers, outline their strengths and weaknesses, and present a pos...

Applying Morphology Generation Models to Machine Translation

We improve the quality of statistical machine translation (SMT) by applying models that predict word forms from their stems using extensive morphological and syntactic information from both the source and target languages. Our inflection generation models are trained independently of the SMT system. We investigate different ways of combining the inflection prediction component with the SMT syst...

Building Russian Word Sketches as Models of Phrases

The paper describes the writing of a Sketch Grammar for the Russian language as part of the Sketch Engine system. The Sketch Engine is a corpus tool which takes as input a corpus of any language and corresponding grammar patterns. The system gives information about a word’s collocability on concrete dependency models, and generates lists of the most frequent phrases for a given...

Automated Word Stress Detection in Russian

In this study we address the problem of automated word stress detection in Russian using character-level models and no part-of-speech taggers. We use a simple bidirectional RNN with LSTM nodes and achieve an accuracy of 90% or higher. We experiment with two training datasets and show that using the data from an annotated corpus is much more efficient than using a dictionary, since it allows us to ...

Publication date: 2003